Linear regression with partially mismatched data: local search with theoretical guarantees
Abstract
Linear regression is a fundamental modeling tool in statistics and related fields. In this paper, we study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. The combinatorial structure of the problem leads to computational challenges. We propose a simple greedy local search algorithm for this optimization problem that enjoys strong theoretical guarantees and appealing computational performance. We prove that, under a suitable scaling of the number of mismatched pairs compared to the number of samples and features, and under certain assumptions on the problem data, our local search algorithm converges to a nearly-optimal solution at a linear rate. In particular, in the noiseless case, our algorithm converges to the global optimal solution with a linear convergence rate. Based on this result, we prove an upper bound for the estimation error of the parameter. We also propose an approximate local search step that allows us to scale our approach to much larger instances. We conduct numerical experiments to gather further insights into our theoretical results, and show promising performance gains compared to existing approaches.
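The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a greedy local search over permutations could look like, not the authors' method. It assumes a pairwise-swap neighborhood, a full-column-rank design matrix `X`, and no explicit cap on the number of mismatches; the function name `local_search_swaps` and its parameters are hypothetical. For a fixed permutation the best coefficients come from least squares, so the objective reduces to the squared norm of the projected response, and each step applies the single swap that lowers it the most.

```python
import numpy as np

def local_search_swaps(X, y, max_iter=100, tol=1e-10):
    """Greedy pairwise-swap local search (illustrative sketch, hypothetical name).

    Minimizes ||H y[perm]||^2 over permutations, where H projects onto the
    orthogonal complement of the column space of X (X assumed full column rank).
    """
    n = len(y)
    Q, _ = np.linalg.qr(X)            # reduced QR: Q has orthonormal columns
    H = np.eye(n) - Q @ Q.T           # symmetric, idempotent projector
    h = np.diag(H)
    G = h[:, None] + h[None, :] - 2.0 * H   # (e_i - e_j)^T H (e_i - e_j)
    perm = np.arange(n)
    for _ in range(max_iter):
        z = y[perm]
        r = H @ z                      # residual for the current permutation
        D = z[None, :] - z[:, None]    # D[i, j] = z_j - z_i
        R = r[:, None] - r[None, :]    # R[i, j] = r_i - r_j
        # Exact change in ||H z||^2 when entries i and j of z are swapped.
        delta = 2.0 * D * R + D ** 2 * G
        i, j = np.unravel_index(np.argmin(delta), delta.shape)
        if delta[i, j] >= -tol:        # no swap improves the objective
            break
        perm[[i, j]] = perm[[j, i]]
    beta, *_ = np.linalg.lstsq(X, y[perm], rcond=None)
    return perm, beta
```

Calling `perm, beta = local_search_swaps(X, y)` returns an index array such that `y[perm]` is the re-matched response, together with the least-squares coefficients for that matching. Each iteration forms n-by-n arrays, so this sketch only suits small instances; the approximate local search step mentioned in the abstract is not reproduced here.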
Similar resources
Gapped Local Similarity Search with Provable Guarantees
We present a program qhash, based on q-gram filtration and high-dimensional search, to find gapped local similarities between two sequences. Our approach differs from past q-gram-based approaches in two main aspects. Our filtration step uses algorithms for a sparse all-pairs problem, while past studies use suffix-tree-like structures and counters. Our program works in sequence-sequence mode, wh...
Local linear regression for generalized linear models with missing data
Fan, Heckman and Wand (1995) proposed locally weighted kernel polynomial regression methods for generalized linear models and quasilikelihood functions. When the covariate variables are missing at random, we propose a weighted estimator based on the inverse selection probability weights. Distribution theory is derived when the selection probabilities are estimated nonparametrically. We show tha...
Local Linear Regression for Data with AR Errors.
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and...
Partially linear censored quantile regression.
Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates ...
Partially Linear Reduced-rank Regression
We introduce a new dimension-reduction technique, the Partially Linear Reduced-rank Regression (PLRR) model, for exploring possible nonlinear structure in a regression involving both multivariate response and covariate. The PLRR model specifies that the response vector loads linearly on some linear indices of the covariate, and nonlinearly on some other indices of the covariate. We give a set o...
Journal
Journal title: Mathematical Programming
Year: 2022
ISSN: 0025-5610, 1436-4646
DOI: https://doi.org/10.1007/s10107-022-01863-y